Abstract — Cognitive radio (CR) techniques allow unlicensed
secondary  users  (SUs)  to  opportunistically  access  underutilized 
primary  channels  that  are  licensed  to  primary  users  (PUs).  We 
consider  a  multi-primary  channel  scenario  in  which  the  SUs 
cooperatively  try  to  find  these  primary  channel  spectrum  holes 
by  limited  spectrum  sensing.  The  objective  is  to  design  the 
optimal  sensing  and  accessing  policy  that  maximizes  the  total 
secondary  system  throughput  on  the  primary  channels  accrued 
over  time,  while  satisfying  a  constraint  on  the  probability  of 
colliding  with  licensed  transmissions.  Although  the  problem  can 
be  formulated  as  a  Partially  Observable  Markov  Decision  Process 
(POMDP),  the  optimal  solutions  are  often  intractable.  As  a  result, 
we  find  the  optimal  myopic  channel  sensing  policy  that  maximizes 
instantaneous  total  secondary  system  throughput  on  the  primary 
channels  at  each  time.  The  contributions  of  this  paper  include: 

1)  developing  a  universal  optimal  myopic  channel  sensing  policy 
that  is  applicable  for  any  number  of  primary  channels,  any 
number  of  SUs  and  any  channel  coefficients  (assumed  known); 

2) formulating a centralized spectrum sensing and decision-making
architecture for cognitive secondary systems that allows
exploitation  of  all  available  spectrum  white  spaces  across  the 
whole  primary  spectrum.  We  compare  our  combined  sensing  and 
accessing  strategies  with  other  proposed  strategies  and  show  that 
our  proposed  strategy  outperforms  them  in  terms  of  the  resulting 
total  secondary  system  throughput  under  the  same  constraints 
on  collision  with  primary  users. 

Index  Terms — Cognitive  radios,  dynamic  spectrum  access, 
Markov  chains,  Neyman-Pearson  detector,  myopic  sensing. 

I.  Introduction 

It is now widely accepted that a large number of licensed
communication channels in a wide range of frequency
bands are underutilized [1]. Dynamic spectrum sharing (DSS)
techniques implemented on CR platforms are proposed as
a method to improve the utilization of the scarce communication
spectrum resources. To achieve this, though, CRs
must  have  the  ability  to  measure,  to  sense,  and  to  learn  the 
channel  characteristics  and  availabilities  so  that  they  can  adjust 
their  transmission  and/or  reception  parameters  to  communicate 
efficiently  while  avoiding  interference  with  licensed  and/or 
unlicensed  users  [2]. 

In this paper, we consider a centralized multi-primary channel
scenario in which the SUs cooperatively try to find and
access the primary channel spectrum holes (white spaces) by
limited spectrum sensing. The objective of this problem is
to  design  the  optimal  secondary  system  channel  sensing  and 
accessing  policy  that  maximizes  the  total  secondary  system 
throughput  on  the  primary  channels  accrued  over  time,  while 
satisfying a constraint on the probability of colliding with licensed
transmissions. Although the problem can be formulated
as a Partially Observable Markov Decision Process (POMDP)
problem, the optimal solutions are often intractable due to the
exponential computational complexity [3]. As a result, in this
paper  we  seek  the  optimal  myopic  channel  sensing  policy  that 
maximizes  instantaneous  total  secondary  system  throughput 
on  the  primary  channels  at  each  time,  without  considering  the 
impact  to  future  expected  total  secondary  system  throughput. 
We  assume  that  the  decision-making  (sensing  and  access)  in 
the  CR  communication  system  is  centralized:  a  central  unit 
gathers  all  channel  sensing  results  from  SUs  over  a  dedicated 
control channel; the sensing and access decisions are made
at the central unit and communicated to the distributed cognitive
SUs over the same dedicated control channel. We call this
central  unit  the  secondary  system  decision  center  (SSDC).  We 
model  each  primary  channel  occupancy  dynamics  as  two-state 
(idle/busy)  independent  Markov  chains.  We  assume  that  the 
state  transition  probabilities  of  the  channel  Markov  model  are 
known. A method for estimating these channel state
transition probabilities, which formulates the
primary channel sensing problem as a Hidden Markov
Model (HMM), can be found in [4]. In order to compare performance with the
sensing/access policy in [5], we adopt the same assumption:
the secondary system has perfect knowledge of the primary
signaling. Clearly, this is not always the case and indeed in
many  situations  may  not  be  realistic.  Other  sensing  strategies 
such  as  waveform  based  sensing  and  cyclostationarity  based 
sensing  [2]  (when  partial  knowledge  about  primary  signaling 
is  available)  can  easily  replace  the  matched-filter  based  sensing 
in  our  model  and  are  expected  to  perform  no  better  than  the 
matched-filter  based  sensing.  When  the  secondary  system  has 
no  knowledge  of  the  primary  signals,  energy  detector  based 
strategies are generally adopted [2].

[Report documentation page. Title: Optimal Myopic Sensing and Dynamic Spectrum Access in Centralized Secondary Cognitive Radio Networks with low-complexity Implementations. Report date: May 2011. Performing organization: University of New Mexico, Department of Electrical and Computer Engineering, Albuquerque, NM 87131. Distribution: approved for public release; distribution unlimited. Supplementary notes: IEEE Vehicular Technology Conference (VTC-spring'2011), Budapest, Hungary, May 2011.]

Many schemes presented in the literature, such as those in [6], [3],
and [5], have previously formulated the dynamic spectrum
access (DSA) problem as a POMDP problem, but have left
the problem unsolved because the optimal solutions are often
intractable due to the exponential computational complexity. In
all these works, suboptimal myopic channel sensing solutions
are  then  proposed  and  derived  under  certain  assumptions.  For 
example, in [6] and [3], the authors developed a myopic channel
sensing policy under the assumption of a certain ordering
of Markov state transition probabilities and the assumption
that the exact transition probabilities are unknown. In these
two papers, the structure of the accessing policies is not provided
since the authors developed the myopic channel sensing policy
based  on  considering  the  maximum  expected  total  secondary 
system  throughput  on  the  primary  channels  at  each  time  in 
their  formulation.  Moreover,  the  adopted  model  is  not  realistic 
because  the  authors  assumed  perfect  sensing.  However,  in  our 
paper,  we  show  that  the  myopic  channel  sensing  policy  actually 
depends  on  the  probability  of  white-space  detection  (channel 
state  idle  being  the  target).  Thus  we  explicitly  express  the 
channel  access  policy  based  on  a  Neyman-Pearson  detector 
because  of  the  interference  constraint  imposed  by  the  PUs.  On 
the  other  hand,  [5]  assumed  that  all  SUs  are  to  be  assigned 
to  the  single  primary  channel  that  has  the  highest  belief 
of  being  idle  at  each  time  to  sense.  This  model  is  clearly 
wasteful  since  only  one  primary  channel  can  be  accessed  at 
each  time  no  matter  how  many  are  available.  This  restriction 
reduces  the  total  secondary  system  throughput  because  the 
transmission  opportunities  on  other  unsensed  channels  are 
missed entirely. In contrast, in our model, we consider
different  time-varying  channel  fading  coefficients  for  different 
SUs  in  modeling  the  nature  of  the  wireless  channels.  As  a 
result,  our  myopic  channel  sensing  policy  exploits  the  spatial 
diversity  of  the  wireless  links  and  makes  the  sensing  decisions 
accordingly. 

Moreover,  our  myopic  channel  sensing  strategy  is  valid  for 
any  set  of  transition  probabilities,  any  number  of  primary 
channels  and/or  SUs,  whereas,  as  pointed  out  above,  schemes 
of  [6]  and  [3]  are  only  applicable  under  certain  conditions  on 
channel transition probabilities. Our method is also applicable
to time-varying primary channel fading coefficients and to primary
channels with different bandwidths and/or different signal-to-noise
ratios (SNRs), although in the case of a single
primary system all channel bandwidths may be identical, as
also assumed in the simulations in [5].

The  remainder  of  the  paper  is  organized  as  follows:  In 
Section  II  we  introduce  the  system  model,  SU  observation 
model,  and  secondary  system  architecture.  In  Section  III,  the 
access  and  sensing  decisions  are  derived.  In  Section  IV  we 
show  simulation  results  using  our  combined  access/sensing 
approach  and  compare  with  other  existing  strategies.  Finally, 
in  Section  V  we  conclude  by  summarizing  our  results. 

II.  Problem  Formulation 
A.  Primary  channel  state  model 

We use $k \in \{0, 1, 2, \ldots\}$ to denote the indices of an infinite
slotted time horizon. We assume a group of $N$ SUs and
a collection of $M$ primary channels. The primary channels
are modeled as statistically identical and independent two-state
Markov chains (busy, or state 1, and idle, or state 0). The
state busy indicates that the channel is occupied by PUs so that
it cannot be used by the SUs; the state idle indicates that there
are no PU transmissions over the channel and it is available
for SUs to access. We denote the true state of primary
channel $m \in \{1, \ldots, M\}$ in time slot $k$ by $S_m(k) \in \{0, 1\}$.
We assume that the state of a primary channel does not
change within a single time slot. The stationary transition
probability of channel $m$ from state $i$ to state $j$ is defined as
$p_{ij} = \Pr\{S_m(k+1) = j \mid S_m(k) = i\}$, $i, j \in \{0, 1\}$. The
transition probability matrix of the Markov chain is denoted
by $\mathbf{P} = \begin{bmatrix} p_{00} & p_{01} \\ p_{10} & p_{11} \end{bmatrix}$.
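
As an illustration of the two-state channel model, the following Python sketch computes the stationary distribution of one channel and samples a state path. The function names are hypothetical, and the numeric matrix is borrowed from the simulation settings reported in Section IV; this is a minimal sketch, not the authors' simulator.

```python
import random

def stationary_distribution(p01, p10):
    """Stationary distribution [pi_0, pi_1] of a two-state chain.

    p01 = Pr{idle -> busy}, p10 = Pr{busy -> idle}.
    Solves pi = pi P together with pi_0 + pi_1 = 1.
    """
    pi0 = p10 / (p01 + p10)
    return [pi0, 1.0 - pi0]

def simulate_channel(P, num_slots, rng):
    """Sample S(0), ..., S(num_slots - 1); 0 = idle, 1 = busy."""
    pi = stationary_distribution(P[0][1], P[1][0])
    state = 0 if rng.random() < pi[0] else 1
    path = []
    for _ in range(num_slots):
        path.append(state)
        # Move to state 1 (busy) with probability P[state][1].
        state = 1 if rng.random() < P[state][1] else 0
    return path

P = [[0.9, 0.1], [0.2, 0.8]]   # values used in the Section IV simulations
pi = stationary_distribution(P[0][1], P[1][0])
path = simulate_channel(P, 20000, random.Random(0))
idle_fraction = path.count(0) / len(path)
```

For this matrix the chain is idle about two thirds of the time, which the empirical `idle_fraction` should approximate.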

When an SU successfully accesses a primary channel that is
actually free during a given time slot, the SU is assumed to
receive a reward proportional to the bandwidth of that channel.
If an SU accesses a primary channel in state busy, it will cause a
collision with the PUs' transmissions and we assume the SU gets
no reward in this case. The accumulated total reward of all
SUs  is  used  as  a  measure  of  the  secondary  system  throughput 
over  the  primary  channels. 

B. Secondary system decision-making architecture

In  order  to  detect  the  spectrum  white  spaces,  SUs  perform 
channel  spectrum  sensing.  We  assume  that  the  secondary 
CRs  are  equipped  with  only  a  single  antenna.  As  a  result, 
when  a  SU  is  performing  channel  sensing,  no  simultaneous 
communication  can  be  performed  by  that  SU.  It  is  also 
assumed  that  a  single  SU  can  only  sense  one  channel  at  a 
time.  As  shown  in  Fig.  1,  SUs  sense  primary  channels  during 
the  designated  sensing  periods  at  the  beginning  of  each  time 
slot.  It  is  assumed  that  if  a  PU  intends  to  use  its  channel  during 
a  transmitting  period,  it  starts  to  transmit  from  the  beginning  of 
that  time  slot,  so  that  SUs  will  observe  a  busy  channel  during 
the  sensing  period.  On  the  other  hand,  we  assume  that  multiple 
SUs  can  simultaneously  sense  the  same  primary  channel. 


[Figure: each time slot is divided into a Sensing Period and a Transmitting Period.]

Fig. 1. Slotted time horizon with Sensing Periods and Transmitting Periods.

As  mentioned  before,  the  SSDC  is  assumed  to  collect  all 
channel  sensing  results  (reports)  from  the  SUs  over  a  dedicated 
control  channel.  We  assume  that  the  SSDC  makes  decisions 
on  which  SU  (or  a  subset  of  SUs)  should  sense/access 
which  primary  channel:  i.e.  the  secondary  system  decisions 
are centralized. We use an $M \times N$ matrix $\mathbf{A}_k$ to denote
the sensing decisions made at time $k$, where $\mathbf{A}_k(m,n) \in \{0, 1\}$,
$\forall m \in \{1, \ldots, M\}$, $n \in \{1, \ldots, N\}$. The matrix entry
$\mathbf{A}_k(m,n) = 1$ or $0$ indicates that SU $n$ should or should not
sense primary channel $m$ at time $k$, respectively. Since we
assume that one SU can only sense one channel at a time, we
have the constraint $\sum_{m=1}^{M} \mathbf{A}_k(m,n) = 1$, $\forall n \in \{1, \ldots, N\}$.
Let $\mathcal{N}_m(k) = \{n \mid \mathbf{A}_k(m,n) = 1\}$ denote the set of
indices of SUs that are assigned to sense channel $m$ during
time slot $k$. Similarly, we use an $M \times N$ matrix $\mathbf{B}_k$ to
denote the access decisions at time $k$, where $\mathbf{B}_k(m,n) \in \{0, 1\}$,
$\forall m \in \{1, \ldots, M\}$, $n \in \{1, \ldots, N\}$. The matrix entry
$\mathbf{B}_k(m,n) = 1$ or $0$ indicates that SU $n$ should or should not
access primary channel $m$ at time $k$, respectively. We use an
$M \times N$ matrix $\mathbf{Y}_k$ to denote the collection of observation
results from all SUs on their assigned primary channels at time
$k$, with $\mathbf{Y}_k(m,n) = y_{m,n}(k)$, where $y_{m,n}(k)$ is the report from
SU $n$ to the SSDC of the state of the $m$-th primary user at time
$k$. This local decision is discussed in detail in the next subsection.
To make access decisions at each time $k$, the SSDC
only needs to look up the entries $\mathbf{Y}_k(m,n)$, $\forall (m,n)$, such that
$\mathbf{A}_k(m,n) = 1$. The secondary system action procedure is
summarized in Algorithm 1.
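
The bookkeeping implied by the sensing matrix can be sketched as follows. The helper names and the small example matrices are hypothetical; SUs and channels are indexed from 0 here, and `None` is an assumed placeholder for entries of the report matrix that were never sensed.

```python
def sensing_sets(A):
    """Return [N_1, ..., N_M]: the SUs assigned to each channel.

    A is an M x N list of lists with 0/1 entries, where A[m][n] = 1
    means SU n senses channel m. Raises ValueError if some SU is not
    assigned to exactly one channel (column sum different from 1).
    """
    M, N = len(A), len(A[0])
    for n in range(N):
        if sum(A[m][n] for m in range(M)) != 1:
            raise ValueError(f"SU {n} must sense exactly one channel")
    return [[n for n in range(N) if A[m][n] == 1] for m in range(M)]

def reports_to_read(A, Y):
    """Entries Y[m][n] the SSDC needs, i.e. those with A[m][n] == 1."""
    return {(m, n): Y[m][n]
            for m, row in enumerate(A)
            for n, a in enumerate(row) if a == 1}

# Hypothetical example: M = 2 channels, N = 3 SUs.
A = [[1, 0, 1],
     [0, 1, 0]]
Y = [[0, None, 1],       # None marks pairs that were not sensed
     [None, 1, None]]
sets = sensing_sets(A)
needed = reports_to_read(A, Y)
```

Here SUs 0 and 2 sense channel 0 and SU 1 senses channel 1, so the SSDC reads exactly three reports.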


Algorithm 1 Secondary system decision-making architecture

1. At each time $k$, based on previous knowledge of primary
channels and channel observations, the SSDC sends out the
sensing decisions $\mathbf{A}_k$ to all SUs.

2. SUs perform channel sensing according to $\mathbf{A}_k$, and the
sensing result $\mathbf{Y}_k$ is reported back to the SSDC.

3. Based on the channel sensing result $\mathbf{Y}_k$, the SSDC sends out
the accessing decisions $\mathbf{B}_k$ to all SUs.

4. SUs access primary channels according to $\mathbf{B}_k$.

5. For $k \to k + 1$, repeat steps 1 through 5.


C.  Secondary  user  sensing  models 

We use $r_{m,n}(k)$ to denote the observation sample on
channel $m$ by SU $n \in \{1, \ldots, N\}$ at time $k$:
$r_{m,n}(k) = h_{m,n}(k) x_m(k) + w_n(k)$, where $w_n(k)$ is the zero-mean Gaussian
receiver noise with variance $\sigma_w^2$ at the $n$-th secondary
receiver (same receiver noise variance for all SUs), and
$h_{m,n}(k)$ is the fading coefficient between the $m$-th primary
transmitter and the $n$-th secondary receiver at time $k$. The
channel coefficient $h_{m,n}(k)$ is assumed to be zero-mean
Gaussian distributed with variance $\sigma_h^2$. We assume that the
SSDC has perfect knowledge of all channel coefficients at each
time $k$. $x_m(k)$ is the primary signal on channel
$m$ (corresponding to the signal from the $m$-th primary transmitter)
at time $k$. In this paper, since we assume the secondary
system has perfect knowledge of the primary signaling, we
assume that the SUs use matched-filter based sensing and
$x_m(k) = S_m(k) \in \{0, 1\}$.

In the CR context, when communication opportunities are
scarce, it is reasonable to assume that instead of transmitting
the raw data $r_{m,n}(k)$, the SUs can only transmit quantized
versions of primary channel observations as reports to the
SSDC. In this paper, without loss of generality, we assume the
simplest case: the reports from SUs to the SSDC are quantized
to 0's and 1's.

[Figure: primary transmission on channel $m$ as the input and the report $y_{m,n}(k)$ as the output of a binary channel.]

Fig. 2. SUs' reports of observations on primary channels can be modeled
as Binary Asymmetric Channels.

We use $y_{m,n}(k) \in \{0, 1\}$ to denote the report of the $m$-th
primary channel state from SU $n$ to the SSDC. Transmission of the
reports $y_{m,n}(k)$ to the SSDC is assumed to be error free. As shown
in Fig. 2, the true state of the $m$-th channel, $S_m(k) \in \{0, 1\}$, and
the report $y_{m,n}(k) \in \{0, 1\}$ can be modeled as the input and
output of a Binary Asymmetric Channel (BAC), respectively.
The two hypotheses on channel $m$ are $\mathcal{H}_1 : S_m(k) = 0$
and $\mathcal{H}_0 : S_m(k) = 1$. We use $\lambda^1_{m,n}(k)$ and
$\lambda^0_{m,n}(k)$ to denote the crossover probabilities under $\mathcal{H}_1$ and
$\mathcal{H}_0$, respectively. We assume $y_{m,n}(k)$ is quantized based on
the maximum a posteriori probability (MAP) detector:

1) if $h_{m,n}(k) > 0$,

$$y_{m,n}(k) = \begin{cases} 0 & \text{if } r_{m,n}(k) < \eta'_{m,n}(k), \\ 1 & \text{if } r_{m,n}(k) > \eta'_{m,n}(k); \end{cases}$$

2) if $h_{m,n}(k) < 0$,

$$y_{m,n}(k) = \begin{cases} 0 & \text{if } r_{m,n}(k) > \eta'_{m,n}(k), \\ 1 & \text{if } r_{m,n}(k) < \eta'_{m,n}(k), \end{cases}$$

where

$$\eta'_{m,n}(k) = \frac{h_{m,n}(k)^2 - 2 \log(\eta_m(k))\, \sigma_w^2}{2 h_{m,n}(k)} \quad \text{and} \quad \eta_m(k) = \frac{\Pr\{S_m(k) = 1\}}{\Pr\{S_m(k) = 0\}}.$$

The resulting crossover probabilities can be
expressed as: 1) when $h_{m,n}(k) > 0$,
$\lambda^1_{m,n}(k) = Q\left(\frac{\eta'_{m,n}(k)}{\sigma_w}\right)$ and
$\lambda^0_{m,n}(k) = Q\left(\frac{h_{m,n}(k) - \eta'_{m,n}(k)}{\sigma_w}\right)$; 2) when
$h_{m,n}(k) < 0$, $\lambda^1_{m,n}(k) = Q\left(\frac{-\eta'_{m,n}(k)}{\sigma_w}\right)$ and
$\lambda^0_{m,n}(k) = Q\left(\frac{\eta'_{m,n}(k) - h_{m,n}(k)}{\sigma_w}\right)$.
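
The MAP quantizer and its crossover probabilities can be sketched in a few lines of Python. The function names and the parameter name `sigma_w` (receiver noise standard deviation) are assumptions made for this illustration, the coefficient `h` is assumed nonzero, and the expression used for the busy-state error with negative `h` follows from the same Gaussian tail argument as the other three cases.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = Pr{Z > x} for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def map_threshold(h, sigma_w, prior_busy):
    """eta' = (h^2 - 2 ln(eta) sigma_w^2) / (2 h), eta = Pr{busy} / Pr{idle}."""
    eta = prior_busy / (1.0 - prior_busy)
    return (h * h - 2.0 * math.log(eta) * sigma_w ** 2) / (2.0 * h)

def map_report(r, h, sigma_w, prior_busy):
    """Quantize the sample r to a report: 0 (idle) or 1 (busy)."""
    thr = map_threshold(h, sigma_w, prior_busy)
    if h > 0:
        return 1 if r > thr else 0
    return 1 if r < thr else 0

def crossover_probabilities(h, sigma_w, prior_busy):
    """Return (lambda1, lambda0): report error when idle / when busy."""
    thr = map_threshold(h, sigma_w, prior_busy)
    if h > 0:
        return q_function(thr / sigma_w), q_function((h - thr) / sigma_w)
    return q_function(-thr / sigma_w), q_function((thr - h) / sigma_w)

# Equal priors, unit gain and unit noise: the threshold is h / 2.
lam1, lam0 = crossover_probabilities(h=1.0, sigma_w=1.0, prior_busy=0.5)
lam1_neg, lam0_neg = crossover_probabilities(h=-1.0, sigma_w=1.0, prior_busy=0.5)
```

With equal priors the two error probabilities coincide, and flipping the sign of `h` leaves them unchanged, as the symmetry of the rule suggests.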


III.  Access  and  Sensing  Decisions 


A.  Access  decisions  at  the  SSDC 

To meet the requirement of keeping the collision probability
with PUs on every channel under a given constraint, the
optimal access decisions at the SSDC must be based on a
Neyman-Pearson detector [7]. For simplicity, we use the variable-length vector
$\mathbf{y}_k(m,:) = \{y_{m,n}(k) : \forall n \in \mathcal{N}_m(k)\}$ to denote all channel
sensing reports at time $k$ from the SUs on channel $m$, and the
variable-length vector $\mathbf{y}_{0:k}(m,:) = \{\mathbf{y}_0(m,:), \ldots, \mathbf{y}_k(m,:)\}$
to denote the sensing history on channel $m$ from time 0 to $k$.

At time $k$, for the $m$-th primary channel, the SSDC chooses
one of the two possible hypotheses based on $\mathbf{y}_{0:k}(m,:)$:

$$\mathcal{H}_1 : \mathbf{y}_{0:k}(m,:) \sim P_{m,1} \quad \text{(channel idle)}$$
$$\mathcal{H}_0 : \mathbf{y}_{0:k}(m,:) \sim P_{m,0} \quad \text{(channel busy)},$$

where $P_{m,1}$ and $P_{m,0}$ denote the conditional probability
density of the vector $\mathbf{y}_{0:k}(m,:)$ given $S_m(k) = 0$ and
$S_m(k) = 1$, respectively. The corresponding likelihood ratio
detector based on $\mathbf{y}_{0:k}(m,:)$ is generally complicated and
hard to derive in closed form because,
at each time $k$, the number of SUs that are on channel
$m$ changes and, as time evolves, the complexity increases.
To simplify the access decision structure, we assume that
the access decisions are made based only on the current
observations $\mathbf{y}_k(m,:)$. Then, for the $m$-th primary channel,
the likelihood ratio is defined as

$$\mathcal{L}(\mathbf{y}_k(m,:)) = \frac{P_{m,1}(\mathbf{y}_k(m,:))}{P_{m,0}(\mathbf{y}_k(m,:))},$$

where we reuse the notations $P_{m,1}$ and $P_{m,0}$ to denote
the conditional probability density of the vector $\mathbf{y}_k(m,:)$ given
$S_m(k) = 0$ and $S_m(k) = 1$, respectively.

The corresponding log-likelihood ratio is given by

$$\mathrm{LLR}(\mathbf{y}_k(m,:)) = \sum_{n \in \mathcal{N}_m(k)} y_{m,n}(k)\, c_{m,n}(k) + d_m(k),$$

where we define

$$c_{m,n}(k) = \ln\left(\frac{\lambda^1_{m,n}(k)}{1 - \lambda^1_{m,n}(k)} \cdot \frac{\lambda^0_{m,n}(k)}{1 - \lambda^0_{m,n}(k)}\right) \quad \text{and} \quad d_m(k) = \sum_{n \in \mathcal{N}_m(k)} \ln\left(\frac{1 - \lambda^1_{m,n}(k)}{\lambda^0_{m,n}(k)}\right).$$

A sufficient statistic is $\sum_{n \in \mathcal{N}_m(k)} y_{m,n}(k)\, c_{m,n}(k)$,
i.e., the test
$\mathrm{LLR}(\mathbf{y}_k(m,:)) \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \tau_m(k)$ is equivalent to the test
$\sum_{n \in \mathcal{N}_m(k)} y_{m,n}(k)\, c_{m,n}(k) \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \tau_m(k) - d_m(k) = \tau'_m(k)$.

We use $f^{\mathcal{H}_i}_{m,k}$ and $F^{\mathcal{H}_i}_{m,k}$ to denote the conditional
probability mass function (pmf) and the conditional
cumulative distribution function (cdf) of the random variable
$\sum_{n \in \mathcal{N}_m(k)} y_{m,n}(k)\, c_{m,n}(k)$ under hypothesis $\mathcal{H}_i$, respectively.
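
As a quick numerical check of the decomposition into a weighted sum of reports plus an offset, the sketch below computes the log-likelihood ratio both ways. All names and the crossover values are hypothetical; `lam1` denotes the probability of reporting 1 when the channel is idle and `lam0` the probability of reporting 0 when it is busy.

```python
import math

def llr_coefficients(lam1, lam0):
    """Per-report weight c and per-SU offset term of the LLR."""
    c = math.log(lam1 * lam0 / ((1.0 - lam1) * (1.0 - lam0)))
    d_term = math.log((1.0 - lam1) / lam0)
    return c, d_term

def llr_via_statistic(reports, lam1s, lam0s):
    """LLR = sum_n y_n c_n + d, with d the sum of the offset terms."""
    total = 0.0
    for y, l1, l0 in zip(reports, lam1s, lam0s):
        c, d_term = llr_coefficients(l1, l0)
        total += y * c + d_term
    return total

def llr_direct(reports, lam1s, lam0s):
    """ln P(reports | idle) - ln P(reports | busy), term by term."""
    total = 0.0
    for y, l1, l0 in zip(reports, lam1s, lam0s):
        p_idle = l1 if y == 1 else 1.0 - l1
        p_busy = 1.0 - l0 if y == 1 else l0
        total += math.log(p_idle / p_busy)
    return total

reports = [0, 1, 0]            # hypothetical reports from three SUs
lam1s = [0.10, 0.25, 0.30]     # hypothetical crossover probabilities
lam0s = [0.20, 0.15, 0.35]
llr_a = llr_via_statistic(reports, lam1s, lam0s)
llr_b = llr_direct(reports, lam1s, lam0s)
```

The two values agree up to rounding, which is why thresholding the weighted sum of reports is equivalent to thresholding the full log-likelihood ratio.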

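Let $\zeta$ denote the collision probability constraint on each
individual primary channel. Then $\tau'_m(k)$ is chosen such
that $1 - F^{\mathcal{H}_0}_{m,k}(\tau'_m(k)) \le \zeta < 1 - F^{\mathcal{H}_0}_{m,k}(\tau'_m(k)) + f^{\mathcal{H}_0}_{m,k}(\tau'_m(k))$. The
randomized access decision rule is then given by

$$\delta_{NP}(\mathbf{y}_k(m,:)) = \begin{cases} 1 & \text{if } \sum_{n \in \mathcal{N}_m(k)} y_{m,n}(k)\, c_{m,n}(k) > \tau'_m(k), \\ \gamma_m(k) & \text{if } \sum_{n \in \mathcal{N}_m(k)} y_{m,n}(k)\, c_{m,n}(k) = \tau'_m(k), \\ 0 & \text{if } \sum_{n \in \mathcal{N}_m(k)} y_{m,n}(k)\, c_{m,n}(k) < \tau'_m(k), \end{cases}$$

where the decisions 1 and 0 stand for access and do not access,
respectively, and when $\sum_{n \in \mathcal{N}_m(k)} y_{m,n}(k)\, c_{m,n}(k) = \tau'_m(k)$ the SSDC
accesses with probability $\gamma_m(k)$. Therefore, the SSDC decides
to access channel $m$ only if $\tilde{\delta}_{NP}(\mathbf{y}_k(m,:)) = 1$, where
$\tilde{\delta}_{NP}$ is a Bernoulli (single-trial binomial) random variable with a probability of
success equal to $\delta_{NP}$. The randomization variable is given
by $\gamma_m(k) = \frac{\zeta - 1 + F^{\mathcal{H}_0}_{m,k}(\tau'_m(k))}{f^{\mathcal{H}_0}_{m,k}(\tau'_m(k))}$. The probability of
correctly detecting white spaces is then found by the following
(notice that $\mathbf{A}_k$ defines $\mathcal{N}_m(k)$, $\forall m$):

$$P_{D,m}(k, \mathbf{A}_k) = \Pr\{\tilde{\delta}_{NP} = 1 \mid \mathcal{H}_1\} = 1 - F^{\mathcal{H}_1}_{m,k}(\tau'_m(k)) + \gamma_m(k)\, f^{\mathcal{H}_1}_{m,k}(\tau'_m(k)). \qquad (1)$$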

This  probability  of  detection  is  used  in  the  sensing  decision 
making  at  the  SSDC  as  described  in  the  next  section. 
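
A small numerical sketch of the randomized access rule is given below. It assumes the textbook form of a randomized Neyman-Pearson test on a discrete statistic (threshold chosen so that the tail mass under the busy hypothesis does not exceed the collision constraint, with the remainder made up by randomizing at the threshold). The function names and the crossover values are hypothetical, and the pmf is built by brute-force enumeration, which is only practical for a handful of SUs.

```python
import itertools
import math

def statistic_pmf(cs, prob_report_one):
    """pmf of T = sum_n y_n c_n given Pr{y_n = 1} for each SU."""
    pmf = {}
    for ys in itertools.product([0, 1], repeat=len(cs)):
        prob = 1.0
        for y, p1 in zip(ys, prob_report_one):
            prob *= p1 if y == 1 else 1.0 - p1
        t = round(sum(y * c for y, c in zip(ys, cs)), 12)
        pmf[t] = pmf.get(t, 0.0) + prob
    return pmf

def neyman_pearson(lam1s, lam0s, zeta):
    """Return (tau, gamma, p_collision, p_detect) for one channel."""
    if not 0.0 < zeta < 1.0:
        raise ValueError("zeta must lie strictly between 0 and 1")
    cs = [math.log(l1 * l0 / ((1.0 - l1) * (1.0 - l0)))
          for l1, l0 in zip(lam1s, lam0s)]
    pmf_busy = statistic_pmf(cs, [1.0 - l0 for l0 in lam0s])  # under H0
    pmf_idle = statistic_pmf(cs, lam1s)                        # under H1
    values = sorted(pmf_busy, reverse=True)
    tail, tau = 0.0, values[-1]      # tail = Pr{T > tau | busy}
    for t in values[:-1]:
        if tail + pmf_busy[t] > zeta:
            tau = t
            break
        tail += pmf_busy[t]
    gamma = min(1.0, (zeta - tail) / pmf_busy[tau])
    p_collision = tail + gamma * pmf_busy[tau]
    p_detect = (sum(p for t, p in pmf_idle.items() if t > tau)
                + gamma * pmf_idle[tau])
    return tau, gamma, p_collision, p_detect

tau, gamma, p_collision, p_detect = neyman_pearson(
    [0.10, 0.25, 0.30], [0.20, 0.15, 0.35], zeta=0.1)
```

In this example the collision probability meets the constraint with equality, which is the purpose of the randomization at the threshold.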

B.  Optimal  and  myopic  sensing  decisions  at  the  SSDC 

The objective of designing the sensing decision rule is to
maximize the total secondary system reward on all channels
accrued over time. To do this, we first define
$b_0(m,k) = \Pr\{S_m(k) = 0 \mid \mathbf{y}_{0:k-1}(m,:)\}$ and $b_1(m,k) = 1 - b_0(m,k)$
as the beliefs of channel $m$ being idle and busy
at time $k$, respectively, given the observation history on channel $m$ up
to time $k - 1$. We define the belief vectors
of idle and busy as $\mathbf{b}_0(k) = [b_0(1,k), \ldots, b_0(M,k)]^T$
and $\mathbf{b}_1(k) = [b_1(1,k), \ldots, b_1(M,k)]^T$. At time $k$, after
obtaining sensing observations from all SUs, the belief
of channel $m$ being idle in the next time slot $k + 1$
is updated at the SSDC using Bayes' formula:

$$b_0(m,k+1) = \frac{\sum_{i \in \{0,1\}} p_{i0} \left[\prod_{n \in \mathcal{N}_m(k)} f_i(y_{m,n}(k))\right] b_i(m,k)}{\sum_{i \in \{0,1\}} \left[\prod_{n \in \mathcal{N}_m(k)} f_i(y_{m,n}(k))\right] b_i(m,k)},$$

where we denote $f_i(y_{m,n}(k)) = \Pr\{y_{m,n}(k) \mid S_m(k) = i\}$, $i \in \{0, 1\}$,
as the conditional pmf of the SUs' reports. For unsensed
primary channels, the belief is updated based on the Markovian
evolution of primary channels:
$[b_0(m,k+1),\ 1 - b_0(m,k+1)] = [b_0(m,k),\ 1 - b_0(m,k)]\,\mathbf{P}$, where $\mathbf{P}$ is the transition matrix.
The belief vectors $\mathbf{b}_0(1)$ and $\mathbf{b}_1(1)$ are initialized with the
stationary distribution $\boldsymbol{\pi} = [\pi_0\ \pi_1]$ of the Markov model given
by $\boldsymbol{\pi} = \boldsymbol{\pi}\mathbf{P}$.
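
The belief recursion for a single channel can be sketched as follows. The function names and the crossover values are hypothetical; `lam1` is the probability of a report of 1 when the channel is idle and `lam0` the probability of a report of 0 when it is busy. Passing no reports gives the pure Markov update used for unsensed channels.

```python
def report_likelihood(y, state, lam1, lam0):
    """f_i(y) = Pr{report y | S = state}; state 0 = idle, 1 = busy."""
    if state == 0:
        return lam1 if y == 1 else 1.0 - lam1
    return 1.0 - lam0 if y == 1 else lam0

def update_idle_belief(b0, P, reports=(), lam1s=(), lam0s=()):
    """Return b0(m, k+1) from b0(m, k).

    With reports, applies Bayes' rule and then one Markov transition.
    With no reports (unsensed channel), only the transition is applied.
    """
    weights = [b0, 1.0 - b0]          # prior on states 0 and 1
    for y, l1, l0 in zip(reports, lam1s, lam0s):
        weights[0] *= report_likelihood(y, 0, l1, l0)
        weights[1] *= report_likelihood(y, 1, l1, l0)
    total = weights[0] + weights[1]
    # Pr{S(k+1) = 0} = sum_i p_i0 * posterior_i
    return (P[0][0] * weights[0] + P[1][0] * weights[1]) / total

P = [[0.9, 0.1], [0.2, 0.8]]
b_unsensed = update_idle_belief(2.0 / 3.0, P)
b_after_idle_report = update_idle_belief(2.0 / 3.0, P, [0], [0.1], [0.2])
b_after_busy_report = update_idle_belief(2.0 / 3.0, P, [1], [0.1], [0.2])
```

Starting from the stationary belief of 2/3, an unsensed channel keeps that belief, a single idle report raises the next-slot idle belief to 0.83, and a single busy report lowers it to 0.34.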

The reward function for channel $m$ at time $k$ is defined as
$r_m(k, \mathbf{A}_k) = B_m\, \mathbb{1}_{\{S_m(k) = 0\}}\, \mathbb{1}_{\{\tilde{\delta}_{NP} = 1\}}$, where we define $B_m$
as the bandwidth of channel $m$ and $\mathbb{1}_E$ is the indicator function
of event $E$. The expected reward for channel $m$ at time $k$
is then given by $E\{r_m(k, \mathbf{A}_k)\} = B_m\, b_0(m,k)\, P_{D,m}(k, \mathbf{A}_k)$,
where $P_{D,m}(k, \mathbf{A}_k)$ is defined in (1).

We define the vector $\mathbf{S}(k) = [S_1(k), \ldots, S_M(k)] \in \mathbb{S}$
as the state of the system at time $k$. When the SUs do not
have  perfect  knowledge  of  the  states  of  the  primary  channels, 
the  actual  state  of  the  system  is  the  belief  vector.  Smallwood 
and  Sondik  have  provided  in  [8]  an  algorithm  to  obtain  the 
optimal  decisions  for  this  POMDP  problem.  However,  when 
the  number  of  primary  channels  is  large,  the  algorithm  requires 
very  high  computational  complexity  and  the  solution  is  often 
intractable  [3]. 

As an alternative, an optimal myopic channel sensing decision
can be defined to maximize the total secondary reward
over all primary channels at each time step. This myopic
sensing decision $\mathbf{A}_k^*$ can be expressed as

$$\mathbf{A}_k^* = \arg\max_{\mathbf{A}_k} \sum_{m=1}^{M} B_m\, b_0(m,k)\, P_{D,m}(k, \mathbf{A}_k), \qquad (2)$$

under the constraint $\sum_{m=1}^{M} \mathbf{A}_k(m,n) = 1$, $\forall n \in \{1, \ldots, N\}$.

Because the entries of the matrix $\mathbf{A}_k$ can only take the values
0 and 1, the objective function in (2) is nonlinear, and
we have the constraint $\sum_{m=1}^{M} \mathbf{A}_k(m,n) = 1$, the above
optimization problem can be cast as a constrained nonlinear
0-1 programming problem [9]. Since the objective function is
non-separable, the solution is generally hard to find. To find
the optimal solution $\mathbf{A}_k^*$, one method is direct search, which
has an exponential complexity of $M^N$.
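
To make the $M^N$ cost of direct search concrete, the sketch below enumerates every assignment of SUs to channels for a tiny instance. The detection probability used here is a hypothetical stand-in (one minus the product of per-link miss probabilities `1 - q[m][n]`), not the Neyman-Pearson expression derived earlier; the names `q`, `b0`, and `bandwidth` and all numbers are illustrative.

```python
import itertools

def expected_reward(assignment, b0, q, bandwidth):
    """Myopic objective for one assignment.

    assignment[n] is the channel sensed by SU n. The detection
    probability is a HYPOTHETICAL stand-in,
    P_D = 1 - prod_n (1 - q[m][n]) over the SUs on channel m.
    """
    total = 0.0
    for m in range(len(b0)):
        miss = 1.0
        for n, chan in enumerate(assignment):
            if chan == m:
                miss *= 1.0 - q[m][n]
        total += bandwidth[m] * b0[m] * (1.0 - miss)
    return total

def direct_search(b0, q, bandwidth, num_sus):
    """Try all M**N assignments; return (best, best_value, count)."""
    best, best_val, count = None, float("-inf"), 0
    for assignment in itertools.product(range(len(b0)), repeat=num_sus):
        count += 1
        val = expected_reward(assignment, b0, q, bandwidth)
        if val > best_val:
            best, best_val = assignment, val
    return best, best_val, count

b0 = [0.8, 0.5]                 # idle beliefs of M = 2 channels
q = [[0.9, 0.6, 0.5],           # hypothetical per-link quality, channel 0
     [0.7, 0.8, 0.3]]           # channel 1
bandwidth = [1.0, 1.0]
best, best_val, count = direct_search(b0, q, bandwidth, num_sus=3)
```

Even this toy case visits $2^3 = 8$ assignments; with 3 channels and 11 SUs the count is already $3^{11} = 177147$ per time slot.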

As an alternative with much lower computational complexity,
we propose a suboptimal algorithm for solving (2) by using
the Hungarian algorithm [10] iteratively. For simplicity, we drop
the time indices from the algorithm description and we let
$B_m = 1$. We assume that the crossover probabilities of the
BACs and the false alarm probability are given. We define
the $M \times N$ matrix $\mathbf{A}^{(m,n)}$ such that $\mathbf{A}^{(m,n)}(m',n') = 1$ if
$(m',n') = (m,n)$, and $\mathbf{A}^{(m,n)}(m',n') = 0$ otherwise. Then,
we use Algorithm 2 to find the channel sensing assignment
$\mathbf{A}$, which provides a suboptimal solution to (2).

We note that the complexity of the Hungarian algorithm
is $(\max\{M, N\})^3$ for an $M \times N$ bipartite graph. Therefore,
the complexity of the proposed iterative Hungarian algorithm
is on the order of $\lceil N/M \rceil (\max\{M, N\})^3$, since the Hungarian
algorithm is run iteratively $\lceil N/M \rceil$ times. In brief, the
proposed algorithm can solve the sensing channel assignment
with roughly an order 4 polynomial complexity.

In  particular,  if  N  <  M,  Algorithm  2  is  equivalent  to  the 
Hungarian  algorithm  which  provides  the  optimal  solution  to 
(2)  in  this  case. 

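Algorithm 2 Iterative Hungarian Algorithm

$\mathbf{A} = \mathbf{0}_{M \times N}$ and $\mathcal{N} = \{1, \ldots, N\}$
while $\mathcal{N} \neq \emptyset$ do
    $\Delta\mathbf{P} = \mathbf{0}_{M \times N}$
    for $m \in \{1, \ldots, M\}$ and $n \in \mathcal{N}$ do
        $\Delta\mathbf{P}(m,n) = \left[P_{D,m}(\mathbf{A} + \mathbf{A}^{(m,n)}) - P_{D,m}(\mathbf{A})\right] b_0(m)$
    end for
    Run the Hungarian algorithm for the $M \times N$ bipartite
    graph whose edge weights are given in $\Delta\mathbf{P}$ to obtain the
    maximum sum matching.
    Remove the assigned vertices from the set $\mathcal{N}$.
    Append the new assignments to matrix $\mathbf{A}$.
end while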
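
The loop structure of Algorithm 2 can be sketched in Python as below. Two substitutions are made for brevity and should be read as assumptions: an exhaustive maximum-sum matching stands in for the Hungarian algorithm (same matching, far slower, no third-party dependency), and the detection probability is a hypothetical stand-in built from per-link values `q[m][n]` rather than expression (1).

```python
import itertools

def detection_prob(m, sus, q):
    """HYPOTHETICAL stand-in for P_{D,m}: 1 - prod (1 - q[m][n])."""
    miss = 1.0
    for n in sus:
        miss *= 1.0 - q[m][n]
    return 1.0 - miss

def max_sum_matching(weights, free_sus):
    """Exhaustive maximum-sum matching of channels to free SUs.

    Stands in for the Hungarian algorithm. Returns (channel, su)
    pairs with each channel and each SU used at most once.
    """
    M = len(weights)
    size = min(M, len(free_sus))
    best, best_val = [], float("-inf")
    for chans in itertools.permutations(range(M), size):
        for sus in itertools.combinations(free_sus, size):
            val = sum(weights[m][n] for m, n in zip(chans, sus))
            if val > best_val:
                best, best_val = list(zip(chans, sus)), val
    return best

def iterative_assignment(b0, q, num_sus):
    """Assign every SU to one channel, at most M SUs per iteration."""
    M = len(b0)
    assigned = [[] for _ in range(M)]   # SUs per channel (the matrix A)
    free = list(range(num_sus))         # SUs not yet assigned
    while free:
        # gain[m][n]: increase of b0(m) * P_D,m when SU n joins channel m.
        gain = [[0.0] * num_sus for _ in range(M)]
        for m in range(M):
            base = detection_prob(m, assigned[m], q)
            for n in free:
                extra = detection_prob(m, assigned[m] + [n], q) - base
                gain[m][n] = extra * b0[m]
        for m, n in max_sum_matching(gain, free):
            assigned[m].append(n)
            free.remove(n)
    return assigned

b0 = [0.8, 0.5]
q = [[0.9, 0.6, 0.5],
     [0.7, 0.8, 0.3]]
assigned = iterative_assignment(b0, q, num_sus=3)
```

For this toy instance the first iteration places SU 0 on channel 0 and SU 1 on channel 1, and the second iteration adds SU 2 to channel 0, where its marginal gain is larger.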


IV.  Simulation  Results  and  Discussions 

In  this  section,  we  compare  the  performance  of  our  proposed 
sensing/access  decisions  with  those  proposed  in  [5],  in  which 
at  each  time,  all  SUs  sense  one  single  primary  channel  with 
the  highest  belief  of  being  idle. 

In order to directly compare the performance of our proposed
optimal myopic sensing solution with the results of [5],
we first simulate the discounted secondary system reward
and we make exactly the same assumptions as in [5]: 1) a
discount factor of 0.999 is assumed for a time horizon from 0 to
10000; 2) the SUs' sensing reports to the SSDC are directly the
observations $r_{m,n}(k)$; 3) all channel coefficients $h_{m,n}(k)$
are set to 1 for all time; 4) the transition probability matrix
of the Markov channel model is P = [0.9 0.1; 0.2 0.8]; 5)
unit bandwidth for all primary channels; 6) the constraint on
the probability of collisions with PUs is $\zeta = 0.1$.

In Fig. 3, we show the simulation results of discounted
reward based on 2 primary channels, with a single SU and also
2 SUs, for $\mathrm{SNR}_{dB} = 20\log_{10}(1/\sigma_w)$ from -5 to 5 dB. The
performance of the approach proposed in [5] is exactly reproduced
in this figure (2 primary channels, 1 SU). Since when
there is only a single SU, the two strategies are equivalent, we
see  that  the  optimal  allocation  of  SUs  and  the  approach  in  [5] 
give  the  same  discounted  reward.  However,  when  there  are  2 
SUs  (the  rest  of  assumptions  staying  the  same),  we  see  that 
our  proposed  approach  leads  to  a  higher  discounted  reward. 
This  is  because  when  all  SUs  are  allocated  to  sense  a  single 
channel,  SUs  lose  access  opportunities  on  the  other  channel. 


Fig. 3. Discounted reward comparison between our proposed method and
the method proposed in [5] (time steps = 10000, 2 channels, $\zeta$ = 0.1).

Next, we compare the resulting percentage of channel
usage (the percentage of successful accesses out of the total
white spaces on all primary channels) of our proposed
sensing/access strategy (the direct-search optimal solution and
the solution found by the iterative Hungarian algorithm) with
that of the strategy in [5], under the following assumptions:
1) no discount factor; 2) the SUs' channel sensing reports are
based on the MAP detector for both competing strategies
(reports are 0's and 1's); 3) the transition probability matrix
of the Markov channel model is P = [0.9 0.1; 0.2 0.8]; 4) the
channel coefficients are standard Gaussian distributed,
$h_{m,n}(k) \sim \mathcal{N}(0, 1)$, and known at the SSDC at
each time; 5) all primary channels have unit bandwidth; 6) the
constraint on the probability of collisions with PUs is
$\zeta$ = 0.1. As shown in Fig. 4, we consider two scenarios:
1) 2 primary channels and 3 SUs; 2) 3 primary channels and
11 SUs (the optimal solution is not shown for this scenario
because of its high computational complexity). In both
scenarios, our proposed approach outperforms the approach in
[5]. Moreover, as the ratio of the number of SUs to the number
of primary channels increases, our proposed strategy achieves a
higher percentage of channel usage, whereas the approach in [5]
performs worse. Indeed, since the approach in [5] allocates all
SUs to only one channel at every time (the channel with the
highest belief of being idle), the higher this ratio, the more
transmission opportunities on the primary channels are wasted.
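
The channel usage metric defined above can be computed from simulation traces as in the following Python sketch. The trace format (boolean matrices `idle[k][m]` and `accessed[k][m]`, indexed by time step k and primary channel m) and the function name are assumptions made for illustration; they are not part of the simulation setup described here.

```python
def channel_usage_percentage(idle, accessed):
    """Percentage of white spaces that were successfully accessed.

    idle[k][m]     -- True if primary channel m is idle at time k
    accessed[k][m] -- True if the secondary system transmits on m at time k
    """
    white_spaces = 0
    successes = 0
    for idle_row, access_row in zip(idle, accessed):
        for is_idle, did_access in zip(idle_row, access_row):
            if is_idle:
                white_spaces += 1
                if did_access:
                    successes += 1
    if white_spaces == 0:
        return 0.0  # no white spaces, so nothing could have been used
    return 100.0 * successes / white_spaces


# Two channels that are both idle for two time steps. A strategy that
# accesses only one channel per step cannot exceed 50% usage here.
idle = [[True, True], [True, True]]
single_channel_access = [[True, False], [False, True]]
usage = channel_usage_percentage(idle, single_channel_access)
```

In this toy trace, `usage` is 50.0, which illustrates why restricting all SUs to one channel per time step caps the achievable usage when several channels are idle simultaneously.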


Fig.  4.  Percentage  of  primary  channel  usage  comparing:  1)  brute  force 
optimal  solution;  2)  iterative  Hungarian  algorithm;  3)  all  SUs  on  a  single 
channel  at  a  time  [5]. 


V.  Conclusions 

We proposed and established a universal optimal myopic
channel sensing policy for a centralized cognitive radio
communication system in which the channel sensing and access
decisions are made at a central unit. Compared with existing
approaches in the literature, our policy is more realistic
because it solves the who-goes-where problem and explicitly
accounts for the dependency on the channel access structure
(applying the Neyman-Pearson detector as the channel access
decision rule at the SSDC). We also showed by simulation that
our approach outperforms the single-channel sensing approach
of [5].

References 

[1] “Report of the spectrum efficiency working group,” FCC, Tech. Rep.,
Nov. 2002.

[2] T. Yucek and H. Arslan, “A survey of spectrum sensing algorithms for
cognitive radio applications,” IEEE Communications Surveys & Tutorials,
vol. 11, no. 1, pp. 116-130, Mar. 2009.

[3] Q. Zhao, B. Krishnamachari, and K. Liu, “On myopic sensing for
multi-channel opportunistic access: structure, optimality, and performance,”
IEEE Transactions on Wireless Communications, vol. 7, no. 12,
pp. 5431-5440, Dec. 2008.

[4] L. Rabiner, “A tutorial on hidden Markov models and selected
applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2,
pp. 257-286, Feb. 1989.

[5] J. Unnikrishnan and V. Veeravalli, “Algorithms for dynamic spectrum
access with learning for cognitive radio,” IEEE Transactions on Signal
Processing, vol. 58, no. 2, pp. 750-760, Feb. 2010.

[6] T. Javidi, B. Krishnamachari, Q. Zhao, and M. Liu, “Optimality of
myopic sensing in multi-channel opportunistic access,” IEEE International
Conference on Communications, pp. 2107-2112, May 2008.

[7] H. V. Poor, An Introduction to Signal Detection and Estimation, 2nd ed.
New York, NY, USA: Springer-Verlag, 1994.

[8] R. D. Smallwood and E. J. Sondik, “The optimal control of partially
observable Markov processes over a finite horizon,” Operations Research,
vol. 21, no. 5, pp. 1071-1088, Sept.-Oct. 1973.

[9] D. Li and X. Sun, Nonlinear Integer Programming, 1st ed. Springer,
2010.

[10]  H.  W.  Kuhn,  “The  Hungarian  method  for  the  assignment  problem,” 
Naval  Research  Logistics  Quarterly,  vol.  2,  pp.  83-97,  Mar.  1955. 


